Mining evolutionary dependencies from web-localization repositories

Authors

  • Huzefa H. Kagdi
  • Jonathan I. Maletic
Abstract

An approach to mining repositories of web-based user documentation for patterns of evolutionary change in the context of internationalization and localization is presented. Localized web documents that are frequently co-changed (i.e., an evolutionary dependency) during the natural language translation process are uncovered to support the future evolution of the system. A sequential-pattern mining technique is used to uncover patterns from version histories. Characteristics of the uncovered patterns such as size, frequency, and occurrence within a single natural language or across multiple languages are discussed. Such patterns help provide an insight into the effort required in retranslation due to a change in the documentation. The approach is validated on the open source K Desktop Environment (KDE) system. KDE maintains documentation for over 50 different natural languages and presents a prime example of the problem. The technique accurately predicts which documents in KDE are retranslated or updated in future versions. Copyright © 2007 John Wiley & Sons, Ltd.
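The idea of an evolutionary dependency, documents that are frequently changed together, can be illustrated with a small sketch. This is not the sequential-pattern mining technique used in the paper; it is a simplified, unordered co-change count over change-sets, shown only to make the notion of support (frequency) concrete. The function name `frequent_cochanges`, the `min_support` parameter, and the file paths are hypothetical and do not come from the paper or from the KDE repository.

```python
from collections import Counter
from itertools import combinations


def frequent_cochanges(change_sets, min_support=2):
    """Return document pairs changed together in at least `min_support` change-sets.

    Simplified illustration only: it ignores the order of changes, which a
    sequential-pattern miner would take into account.
    """
    counts = Counter()
    for files in change_sets:
        # De-duplicate and sort so each unordered pair has one canonical form.
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical change-sets; the paths are invented for illustration.
history = [
    ["de/kmail/index.docbook", "fr/kmail/index.docbook"],
    ["de/kmail/index.docbook", "fr/kmail/index.docbook", "de/kmail/faq.docbook"],
    ["fr/kmail/index.docbook", "de/kmail/faq.docbook"],
]

patterns = frequent_cochanges(history)
for pair, support in sorted(patterns.items()):
    print(pair, support)
```

In this toy history, both surviving pairs span two languages, which is the kind of cross-language pattern the abstract distinguishes from patterns confined to a single language.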

Similar resources

A Technique for Improving Web Mining using Enhanced Genetic Algorithm

The World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their search results can still be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms which search according to the user interests...

Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report

The Mining Software Repositories (MSR) field analyzes software repository data to uncover knowledge and assist development of ever growing, complex systems. However, existing approaches and platforms for MSR analysis face many challenges when performing large-scale MSR studies. Such approaches and platforms rarely scale easily out of the box. Instead, they often require custom scaling tricks an...

Analysing Software Repositories to Understand Software Evolution

Software repositories such as versioning systems, defect tracking systems, and archived communication between project personnel are used to help manage the progress of software projects. Software practitioners and researchers increasingly recognize the potential benefit of mining this information to support the maintenance of software systems, improve software design or reuse, and empirically v...

Discovering data dependencies in Web content mining

Web content mining opens up the possibility to use data presented in web pages for the discovery of interesting and useful patterns. Our web mining tool, FBL (Filtered Bayesian Learning), performs a two-stage process: first it analyzes data present in a web page, and then, using information about the data dependencies encountered, it performs the mining phase based on Bayesian learning. The Näi...

From Static to Dynamic Web Usage Mining: Towards Scalable Profiling and Personalization with Evolutionary Computation

In addition to its ever-expanding size and lack of structure, the World Wide Web has not been responsive to user preferences and interests. Personalization deals with tailoring a user’s interaction with the Web information space based on information about him/her. Mass profiling is based on general trends of usage patterns (thus protecting privacy) compiled from all users on a site, and can be ...

Journal title:
  • Journal of Software Maintenance

Volume 19

Publication date: 2007